========================================================
Data source: Prosper loan data
I chose to explore the “Loan Data from Prosper” dataset from Prosper.com. Prosper is a peer-to-peer lending marketplace. Borrowers make loan requests and investors fund the loans of their choice. Once the process is complete, borrowers make fixed monthly payments and investors receive a portion of those payments directly in their Prosper account. I am interested in which factors benefit borrowers and lenders the most.
## [1] 113937 81
## Classes 'tbl_df', 'tbl' and 'data.frame': 113937 obs. of 81 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
## $ CreditGrade : Factor w/ 9 levels "","A","AA","B",..: 5 1 8 1 1 1 1 1 1 1 ...
## $ Term : int 36 36 36 36 36 60 36 36 36 36 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating..numeric. : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating..Alpha. : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory..numeric. : int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerState : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
## $ Occupation : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
## $ EmploymentStatus : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
## $ CurrentlyInGroup : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
## $ GroupKey : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
## $ DateCreditPulled : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent..percentage. : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
## $ IncomeVerifiable : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
## $ LoanOriginationQuarter : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
I will begin by exploring the trend of loans on prosper.com since origination.
Prosper loans fall into two periods: the first from 2005 Q4 to 2008 Q4, and the second from 2009 Q2 to 2014 Q1. In 2009 Q1 Prosper entered a ‘quiet period’ while seeking regulatory approval from the SEC. Loan volume rose year over year.
Next, take a look at the purpose of each loan.
## Not Available Debt Consolidation Home Improvement
## 58308 7433 7189
## Business Personal Loan Student Use
## 2395 756 2572
## Auto Other Baby&Adoption
## 10494 199 85
## Boat Cosmetic Procedure Engagement Ring
## 91 217 59
## Green Loans Household Expenses Large Purchases
## 1996 876 1522
## Medical/Dental Motorcycle RV
## 304 52 885
## Taxes Vacation Wedding Loans
## 768 771 0
## NA's
## 16965
“Not Available” accounts for nearly 60,000 listings on prosper.com. Excluding “Not Available”, the top three loan categories are “Auto”, “Debt Consolidation” and “Home Improvement”, in that order.
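The category labels are attached to the numeric codes stored in ListingCategory..numeric., which start at 0 in the data preview (0, 2, 16, ...). A minimal sketch of that mapping in base R follows. It assumes the code-to-label order of the Prosper variable dictionary (0 = Not Available through 20 = Wedding Loans); `listing_labels` and `label_listing_category` are hypothetical names, and the codes are toy values rather than rows from the dataset.

```r
# Label order is an assumption taken from the Prosper variable dictionary
# (codes 0 to 20).
listing_labels <- c(
  "Not Available", "Debt Consolidation", "Home Improvement", "Business",
  "Personal Loan", "Student Use", "Auto", "Other", "Baby&Adoption", "Boat",
  "Cosmetic Procedure", "Engagement Ring", "Green Loans", "Household Expenses",
  "Large Purchases", "Medical/Dental", "Motorcycle", "RV", "Taxes",
  "Vacation", "Wedding Loans"
)

# Hypothetical helper: map numeric listing codes to a labelled factor.
# levels = 0:20 makes code 0 a real level instead of turning it into NA.
label_listing_category <- function(codes) {
  factor(codes, levels = 0:20, labels = listing_labels)
}

# Toy codes for illustration only.
toy_codes <- c(0, 1, 1, 2, 7, 20)
toy_factor <- label_listing_category(toy_codes)
table(toy_factor)[c("Not Available", "Debt Consolidation", "Wedding Loans")]
```

If the labels were matched to codes 1-21 instead, every count would move one label up, code 0 would become NA and the last level would be empty, so the level range is worth double-checking against the table.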
Next, look at the borrower’s interest rate on the loan.
BorrowerRate is approximately normally distributed, with a mean of 0.19 and a median of 0.18. EstimatedEffectiveYield is also approximately normally distributed and looks very similar to BorrowerRate.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.1340 0.1840 0.1928 0.2500 0.4975
Next, the yield lenders receive on the loans they fund.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.0100 0.1242 0.1730 0.1827 0.2400 0.4925
Which loan terms are available?
## 12 36 60
## 1614 87778 24545
Loans have terms of 12, 36 or 60 months, and the 36-month term is the most popular choice.
Next, check the loan amounts on prosper.com.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6500 8337 12000 35000
Loan amounts range from $1,000 to $35,000, and 75% of loans are $12,000 or less.
Take a look at the income ranges of Prosper borrowers.
## Not displayed Not employed $0 $1-24,999 $25,000-49,999
## 7741 806 621 7274 32192
## $50,000-74,999 $75,000-99,999 $100,000+
## 31050 16916 17337
Most borrowers have an income between $25,000 and $75,000; relatively few have an income below $25,000.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 3200 4667 5608 6825 1750000
75% of borrowers have a stated monthly income of $6,825 or less, but the maximum is $1,750,000, which does not look plausible.
Next, check the borrowers’ debt-to-income ratio.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.000 0.140 0.220 0.276 0.320 10.010 8554
The median borrower has debt equal to 22% of income, and 75% of borrowers have debt of 32% of income or less.
Next, investigate the borrowers’ ratings.
## AA A B C D E HR NC NA NA's
## 5372 14551 15581 18345 14274 9795 6935 0 0 29084
Ignoring the missing values, most borrowers have a Prosper Rating of C, B, A or D.
Now look at the investor information.
How many investors fund each loan on Prosper?
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 2.00 44.00 80.48 115.00 1189.00
The number of investors per loan is approximately lognormally distributed; a few loans attracted a very large number of investors.
Next, check the yield for investors.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.005 0.042 0.072 0.080 0.112 0.366 29084
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -0.183 0.074 0.092 0.096 0.117 0.284 29084
Estimated loss and estimated return for investors look nearly identical in scale, with upper quartiles of around 11-12%.
The dataset contains 81 variables on 113,937 loans made through the prosper.com marketplace. The loans cover the period from 2005-11-15 to 2014-03-12. Variables are of classes int, numeric, date and factor.
The main features of interest for me are the lender yield and the borrower rate. I want to know which factors influence LenderYield and BorrowerRate.
I think ProsperRating, CreditScore, IncomeRange, Term, DebtToIncomeRatio and LoanCategory may help to explain LenderYield and BorrowerRate.
I created a single CreditRange variable that represents the average of CreditScoreRangeUpper and CreditScoreRangeLower.
I also created factor variables from the ListingCategory and TermInmounth variables.
The ListingCategory and IncomeRange levels were unordered, so I reordered them for better plots. I transformed StatedMonthlyIncome and DebtToIncomeRatio, whose long right tails contain large values that look like outliers.
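The derived variables described above can be sketched in base R as follows. This is an illustration under assumptions, not the original code: the toy rows are made up, `TermFactor` and `LogMonthlyIncome` are hypothetical names, and log10(x + 1) is only one plausible choice of transform, since the text does not say which one was used. The credit score midpoint is named CreditScoreRange to match the correlation output later in the report.

```r
# Toy rows for illustration only; the real data frame has 81 columns.
toy <- data.frame(
  CreditScoreRangeLower = c(640, 680, 800),
  CreditScoreRangeUpper = c(659, 699, 819),
  StatedMonthlyIncome   = c(3083, 6125, 0),
  Term                  = c(36, 36, 60)
)

# Midpoint of the lower and upper credit score bounds.
toy$CreditScoreRange <-
  (toy$CreditScoreRangeLower + toy$CreditScoreRangeUpper) / 2

# Term as an ordered factor so that plots keep the order 12 < 36 < 60.
toy$TermFactor <- factor(toy$Term, levels = c(12, 36, 60), ordered = TRUE)

# log10(x + 1) compresses the long right tail and tolerates zero incomes.
# The choice of transform is an assumption.
toy$LogMonthlyIncome <- log10(toy$StatedMonthlyIncome + 1)

toy[, c("CreditScoreRange", "TermFactor", "LogMonthlyIncome")]
```

Keeping the unused 12-month level in the factor means that plots and tables still show all three terms even when a subset has no short-term loans.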
Next, explore the correlations between features in the dataset.
Start with the trend of loan categories since loan origination.
From 2007 Prosper had seven loan categories; in 2011 it added 13 more, for the 20 categories available now.
Small loans tend to have short terms and large loans tend to have long terms.
Excluding “Not Applicable” and “Other”, Home Improvement and Baby&Adoption loans account for the largest loan amounts on Prosper.
## Source: local data frame [12 x 3]
##
## group sum ratio
## (fctr) (int) (dbl)
## 1 Cancelled 8500 0.00
## 2 Chargedoff 76735809 8.08
## 3 Completed 235643536 24.81
## 4 Current 586174602 61.71
## 5 Defaulted 32550755 3.43
## 6 FinalPaymentInProgress 1710955 0.18
## 7 Past Due (>120 days) 132500 0.01
## 8 Past Due (1-15 days) 6825567 0.72
## 9 Past Due (16-30 days) 2161454 0.23
## 10 Past Due (31-60 days) 3097964 0.33
## 11 Past Due (61-90 days) 2419496 0.25
## 12 Past Due (91-120 days) 2433209 0.26
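Loan status looks fairly healthy: past-due loans make up less than 2% of the total loan amount, although charged-off and defaulted loans account for a further 11.5%.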
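A table of this shape (group, sum, ratio) can be built with a grouped sum followed by a percentage. The sketch below uses base R on made-up rows; it assumes the `sum` column is the total LoanOriginalAmount per status, which is consistent with the totals above but is not stated in the report.

```r
# Toy loans for illustration only (amounts are made up).
toy <- data.frame(
  LoanStatus         = c("Current", "Current", "Completed", "Chargedoff"),
  LoanOriginalAmount = c(10000, 5000, 4000, 1000)
)

# Total original amount per loan status.
status_sum <- aggregate(LoanOriginalAmount ~ LoanStatus, data = toy, FUN = sum)
names(status_sum) <- c("group", "sum")

# Share of the overall amount, in percent, rounded to two decimals.
status_sum$ratio <- round(100 * status_sum$sum / sum(status_sum$sum), 2)
status_sum
```

Summing the `ratio` column over the rows of interest (for example every group starting with "Past Due") gives the combined share quoted in the text.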
##
## Pearson's product-moment correlation
##
## data: pdf$StatedMonthlyIncome and pdf$LoanOriginalAmount
## t = 69.353, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1956816 0.2068243
## sample estimates:
## cor
## 0.2012595
Borrowers with higher incomes tend to take out larger loans, although the correlation is weak (r = 0.20). This may be related to their ability to repay.
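The correlation output above comes from a Pearson test, and its t statistic follows directly from r and the degrees of freedom: t = r * sqrt(df / (1 - r^2)). With r = 0.2012595 and df = 113,935 (assuming no missing values among the 113,937 rows) this gives t of about 69.35, in line with the printed value. A small sketch on made-up vectors shows the same relationship; `income` and `amount` are toy data, not columns from the dataset.

```r
# Toy vectors for illustration only.
income <- c(2000, 3000, 4500, 5000, 7000, 9000)
amount <- c(3000, 2500, 6000, 4000, 9000, 8000)

# Pearson correlation with a two-sided test of "true correlation is 0".
ct <- cor.test(income, amount, method = "pearson")

# Recompute the t statistic from r and the degrees of freedom (n - 2).
r  <- unname(ct$estimate)
df <- unname(ct$parameter)
t_from_r <- r * sqrt(df / (1 - r^2))

c(r = r, t = unname(ct$statistic), t_from_r = t_from_r)
```

With more than 100,000 rows even a weak correlation yields a very large t and a tiny p-value, so the size of r is more informative here than the significance level.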
Credit scores appear to be higher for homeowners than for borrowers who do not own a home.
The number of loans by term and credit rating
The 36-month loan is the most common choice for borrowers of every credit rating. Investors may want to focus on loans with this term.
Most borrowers on Prosper have an income between $25,000 and $100,000 and ratings from AA to HR. Most borrowers with excellent ratings (B to AA) have an income above $75,000, while most borrowers with poor ratings (D to HR) have an income between $25,000 and $50,000.
##
## Pearson's product-moment correlation
##
## data: pdf$ProsperRatingScore and pdf$CreditScoreRange
## t = 191.27, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.544155 0.553558
## sample estimates:
## cor
## 0.5488738
High credit scores go with the excellent rating (AA). Surprisingly, the poorest Prosper rating, “HR”, has higher credit scores than rating E.
##
## Pearson's product-moment correlation
##
## data: pdf$CurrentDelinquencies and pdf$CreditScoreRange
## t = -133.37, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3734729 -0.3634055
## sample estimates:
## cor
## -0.36845
The number of delinquencies appears to be related to the credit score (r = -0.37): borrowers with low credit scores are more likely to have many delinquencies.
## CreditRating: AA
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04000 0.06990 0.07790 0.07912 0.08450 0.21000
## --------------------------------------------------------
## CreditRating: A
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0498 0.0990 0.1119 0.1129 0.1239 0.2150
## --------------------------------------------------------
## CreditRating: B
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0693 0.1414 0.1509 0.1545 0.1639 0.3500
## --------------------------------------------------------
## CreditRating: C
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0895 0.1765 0.1914 0.1944 0.2099 0.3500
## --------------------------------------------------------
## CreditRating: D
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1157 0.2287 0.2492 0.2464 0.2625 0.3500
## --------------------------------------------------------
## CreditRating: E
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1479 0.2712 0.2925 0.2933 0.3149 0.3600
## --------------------------------------------------------
## CreditRating: HR
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1779 0.3134 0.3177 0.3173 0.3177 0.3600
## --------------------------------------------------------
## CreditRating: NC
## NULL
## --------------------------------------------------------
## CreditRating: NA
## NULL
##
## Pearson's product-moment correlation
##
## data: pdf$ProsperRatingScore and pdf$BorrowerRate
## t = -917.37, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9537172 -0.9524846
## sample estimates:
## cor
## -0.9531049
The excellent rating, AA, carries lower risk and therefore a lower interest rate. On the other hand, the poor rating, HR, carries high risk and so requires a high interest rate.
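Per-rating summaries like the ones above can be produced with `by()`, and `tapply()` gives a compact view of how the typical rate rises as the rating worsens. The sketch below is an assumption about the approach, run on made-up rates for three of the ratings.

```r
# Toy rates for illustration only.
toy <- data.frame(
  CreditRating = factor(c("AA", "AA", "C", "C", "HR", "HR"),
                        levels = c("AA", "C", "HR")),
  BorrowerRate = c(0.07, 0.08, 0.18, 0.20, 0.31, 0.32)
)

# One summary() per rating level.
with(toy, by(BorrowerRate, CreditRating, summary))

# Median rate per rating, as a named vector ordered from best to worst rating.
median_rate <- tapply(toy$BorrowerRate, toy$CreditRating, median)
median_rate
```

Setting the factor levels explicitly, from best to worst rating, keeps the groups in a meaningful order rather than alphabetical order.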
##
## Pearson's product-moment correlation
##
## data: pdf$BorrowerRate and pdf$LoanOriginalAmount
## t = -117.58, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3341283 -0.3237719
## sample estimates:
## cor
## -0.3289599
Small loans carry higher interest rates than larger loans (r = -0.33); this is related to the length of the loan.
Borrowers who own a home receive lower interest rates than borrowers who do not.
Now look at the investor side.
Most investment terms on prosper.com are 36 and 60 months.
##
## Pearson's product-moment correlation
##
## data: pdf$Investors and pdf$LenderYield
## t = -96.233, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2795354 -0.2687953
## sample estimates:
## cor
## -0.2741739
Loans with higher yields tend to have fewer investors, but the correlation between the two variables is weak (r = -0.27).
For the borrower, BorrowerRate depends on CreditRating, income, term and loan amount. For the investor, the return depends on BorrowerRate, term and the amount funded.
I also observed that borrowers who are homeowners have a lower average interest rate.
The strongest relationship I found is between CreditRating and BorrowerRate (r = -0.95).
The main objective of this section is to look further into the relationships observed in the previous section.
The total loan amount has grown continuously over time; since 2013 in particular, it has grown by over 300%. Loans whose type is “Not Available” show about double the growth rate over time. Loans are used to improve quality of life rather than for entertainment. The top three loan types are Auto, Debt Consolidation and Home Improvement.
Borrowers with higher incomes have a better chance of improving their CreditRating, and can borrow more money at a better interest rate.
The 36-month term gives investors a better yield.
Lender yield clearly differs by credit rating: borrowers with excellent ratings generate low returns, while borrowers with poor ratings generate high returns.
This plot shows that since FY2011 most investors have had no loss on their investments. The most common investment terms are 36 and 60 months, and from FY2014 there is no short-term investment.
The most popular loan term is 36 months for every credit rating. Lender yield increases gradually as the borrower’s credit rating decreases.
What I found interesting is how clearly BorrowerRate differs across CreditRating levels.
I did not create a model for my dataset.
Loan status by percentage, from origination through 2014 Q1
##
## 2005 Q4 2006 Q1 2006 Q2 2006 Q3 2006 Q4 2007 Q1
## Cancelled " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Chargedoff " 0.00" " 7.00" " 13.00" " 15.00" " 20.00" " 23.00"
## Defaulted " 0.00" " 20.00" " 22.00" " 25.00" " 23.00" " 18.00"
## Past Due " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Current " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Completed "100.00" " 73.00" " 65.00" " 60.00" " 57.00" " 58.00"
##
## 2007 Q2 2007 Q3 2007 Q4 2008 Q1 2008 Q2 2008 Q3
## Cancelled " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Chargedoff " 27.00" " 27.00" " 25.00" " 24.00" " 25.00" " 23.00"
## Defaulted " 14.00" " 12.00" " 11.00" " 10.00" " 9.00" " 9.00"
## Past Due " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Current " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Completed " 59.00" " 61.00" " 64.00" " 66.00" " 66.00" " 68.00"
##
## 2008 Q4 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2
## Cancelled " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Chargedoff " 20.00" " 15.00" " 9.00" " 12.00" " 12.00" " 14.00"
## Defaulted " 8.00" " 8.00" " 5.00" " 3.00" " 3.00" " 3.00"
## Past Due " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Current " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Completed " 72.00" " 77.00" " 86.00" " 84.00" " 85.00" " 83.00"
##
## 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4
## Cancelled " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Chargedoff " 12.00" " 14.00" " 15.00" " 18.00" " 16.00" " 15.00"
## Defaulted " 3.00" " 3.00" " 3.00" " 4.00" " 3.00" " 3.00"
## Past Due " 0.00" " 0.00" " 1.00" " 3.00" " 2.00" " 4.00"
## Current " 0.00" " 1.00" " 10.00" " 28.00" " 33.00" " 36.00"
## Completed " 84.00" " 81.00" " 71.00" " 47.00" " 46.00" " 43.00"
##
## 2012 Q1 2012 Q2 2012 Q3 2012 Q4 2013 Q1 2013 Q2
## Cancelled " 0.00" " 0.00" " 0.00" " 0.00" " 0.00" " 0.00"
## Chargedoff " 14.00" " 13.00" " 11.00" " 8.00" " 3.00" " 2.00"
## Defaulted " 3.00" " 2.00" " 2.00" " 1.00" " 0.00" " 0.00"
## Past Due " 3.00" " 4.00" " 5.00" " 5.00" " 4.00" " 3.00"
## Current " 46.00" " 51.00" " 56.00" " 63.00" " 73.00" " 85.00"
## Completed " 34.00" " 30.00" " 26.00" " 23.00" " 19.00" " 9.00"
##
## 2013 Q3 2013 Q4 2014 Q1
## Cancelled " 0.00" " 0.00" " 0.00"
## Chargedoff " 0.00" " 0.00" " 0.00"
## Defaulted " 0.00" " 0.00" " 0.00"
## Past Due " 3.00" " 2.00" " 0.00"
## Current " 91.00" " 96.00" " 99.00"
## Completed " 6.00" " 3.00" " 1.00"
Demand for loans has grown rapidly year by year since FY2011. This plot shows that about 99% of the most recently originated loans on Prosper.com (2014 Q1) have “Current” status. The share of bad loans (Cancelled, Chargedoff, Defaulted and Past Due) was very high, roughly 30-40%, for loans originated during 2006-2008. It improved after 2009, and loan status now looks very good: the share of bad loans has been about 3% or less from 2013 Q3 through 2014 Q1. One interesting result is that the share of “Completed” loans decreases every year, down to 1% in 2014 Q1, which is expected because recently originated loans are still being repaid.
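A status-by-quarter percentage table like the one above can be sketched with `table()` and `prop.table()`. The rows below are made up, and collapsing every “Past Due (...)” bucket into a single “Past Due” group is an assumption based on the row labels in the table; `Status` is a hypothetical column name.

```r
# Toy loans for illustration only.
toy <- data.frame(
  Quarter    = c("2013 Q4", "2013 Q4", "2013 Q4", "2013 Q4",
                 "2014 Q1", "2014 Q1"),
  LoanStatus = c("Current", "Current", "Past Due (1-15 days)", "Completed",
                 "Current", "Current")
)

# Collapse all past-due buckets into one group.
toy$Status <- sub("^Past Due.*$", "Past Due", toy$LoanStatus)

# Counts per status (rows) and origination quarter (columns).
counts <- table(toy$Status, toy$Quarter)

# Column-wise percentages, so each quarter adds up to 100.
pct <- round(100 * prop.table(counts, margin = 2))
pct
```

Because `margin = 2` normalises within each quarter, the columns are comparable even though the number of loans per quarter differs widely.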
Summary of Interest rate by credit rating
## CreditRating: AA
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04000 0.06990 0.07790 0.07912 0.08450 0.21000
## --------------------------------------------------------
## CreditRating: A
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0498 0.0990 0.1119 0.1129 0.1239 0.2150
## --------------------------------------------------------
## CreditRating: B
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0693 0.1414 0.1509 0.1545 0.1639 0.3500
## --------------------------------------------------------
## CreditRating: C
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0895 0.1765 0.1914 0.1944 0.2099 0.3500
## --------------------------------------------------------
## CreditRating: D
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1157 0.2287 0.2492 0.2464 0.2625 0.3500
## --------------------------------------------------------
## CreditRating: E
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1479 0.2712 0.2925 0.2933 0.3149 0.3600
## --------------------------------------------------------
## CreditRating: HR
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1779 0.3134 0.3177 0.3173 0.3177 0.3600
## --------------------------------------------------------
## CreditRating: NC
## NULL
## --------------------------------------------------------
## CreditRating: NA
## NULL
Summary of loan amount by credit rating
## CreditRating: AA
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 6000 10940 11580 16000 35000
## --------------------------------------------------------
## CreditRating: A
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 5850 10000 11460 15000 35000
## --------------------------------------------------------
## CreditRating: B
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 6000 10000 11620 15000 35000
## --------------------------------------------------------
## CreditRating: C
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 5000 10000 10390 15000 25000
## --------------------------------------------------------
## CreditRating: D
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6100 7083 10000 15000
## --------------------------------------------------------
## CreditRating: E
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 3600 4000 4586 5000 15900
## --------------------------------------------------------
## CreditRating: HR
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 3000 4000 3463 4000 16800
## --------------------------------------------------------
## CreditRating: NC
## NULL
## --------------------------------------------------------
## CreditRating: NA
## NULL
This chart shows the relationship between interest rates and loan amounts on www.prosper.com. Prosper offers interest rates of roughly 4-40%, and loan amounts range from $1,000 to $35,000. Both the interest rate and the loan amount depend on the borrower’s credit rating. Interest rates for borrowers with excellent credit are typically lower than for borrowers with poor credit. Borrowers with excellent to very good credit (AA to A) can get an interest rate of about 4-20% and can borrow $1,000-35,000. Borrowers with moderate to poor credit (B to HR) pay interest rates of about 10-40%; those with moderate ratings (B to C) can borrow $1,000-35,000, and those with poor ratings can borrow $1,000-17,000. However, the actual limits also depend on other factors, such as the loan term and the loan type.
This plot shows that loans on prosper.com generated impressive returns for investors from 2009 to 2012. The ROI on short-term 12-month loans was about 4-10%, on medium-term 36-month loans 10-40%, and on long-term 60-month loans as high as 20-60%. Returns are much higher when the borrower has a poor credit rating. After 2013, however, the return on investment fell for all loan terms and all credit ratings, and in 2014 it declined sharply.
The Prosper dataset has 113,937 records with 81 variables. I started by reading the documentation, selected 10-15 interesting variables, and planned to use various plots to check and explain the relationships between them. The difficulties I had with the data mainly stemmed from understanding the variables, asking interesting questions, and then selecting an appropriate technique for the analysis. From my analysis, I found that the way for a borrower to get a lower interest rate is to earn an excellent credit rating by increasing income, lowering the debt ratio or owning a home. Investors seeking a high return may want to explore loans in states such as Mississippi, Oregon, Iowa, Nebraska, North Dakota, Indiana and Ohio.
To continue from here, I would like to make a predictive model to compare against the data. I am looking forward to learning more about modeling and predictions in the Machine Learning class.